Everything about Method Of Least Squares totally explained
The method of
least squares, also known as
regression analysis, is used to model numerical data obtained from observations by adjusting the parameters of a model so as to get an optimal fit of the data. The best fit is that instance of the model for which the sum of
squared residuals has its
least value, a
residual being the difference between an observed value and the value given by the model. The method was first described by
Carl Friedrich Gauss around 1794. Least squares corresponds to the
maximum likelihood criterion if the experimental errors have a
normal distribution. Regression analysis is available in most
statistical software packages.
History
Context
The method of least squares grew out of the fields of
astronomy and
geodesy as scientists and mathematicians sought to provide solutions to the challenges of navigating the Earth's oceans during the
Age of Exploration. The accurate description of the behavior of celestial bodies was key to enabling ships to sail in open seas where before sailors had to rely on land sightings to determine the positions of their ships.
The method was the culmination of several advances that took place during the course of the
eighteenth century:
- The combination of different observations taken under the same conditions as opposed to simply trying one's best to observe and record a single observation accurately. This approach was notably used by Tobias Mayer while studying the librations of the moon.
- The combination of different observations as being the best estimate of the true value; errors decrease with aggregation rather than increase, perhaps first expressed by Roger Cotes.
- The combination of different observations taken under different conditions as notably performed by Roger Joseph Boscovich in his work on the shape of the earth and Pierre-Simon Laplace in his work in explaining the differences in motion of Jupiter and Saturn.
- The development of a criterion that can be evaluated to determine when the solution with the minimum error has been achieved, developed by Laplace in his Method of Situation.
The method itself
Carl Friedrich Gauss is credited with developing the fundamentals of the basis for least-squares analysis in
1795 at the age of eighteen.
An early demonstration of the strength of Gauss's method came when it was used to predict the future location of the newly discovered asteroid
Ceres. On
January 1st,
1801, the Italian astronomer
Giuseppe Piazzi discovered Ceres and was able to track its path for 40 days before it was lost in the glare of the sun. Based on this data, it was desired to determine the location of Ceres after it emerged from behind the sun without solving the complicated
Kepler's nonlinear equations of planetary motion. The only predictions that successfully allowed Hungarian astronomer
Franz Xaver von Zach to relocate Ceres were those performed by the 24-year-old Gauss using least-squares analysis.
Gauss didn't publish the method until
1809, when it appeared in volume two of his work on celestial mechanics,
Theoria Motus Corporum Coelestium in sectionibus conicis solem ambientium.
In
1829, Gauss was able to state that the least-squares approach to regression analysis is optimal in the sense that in a linear model where the errors have a mean of zero, are uncorrelated, and have equal variances, the best linear unbiased estimators of the coefficients is the least-squares estimators. This result is known as the
Gauss-Markov theorem.
The idea of least-squares analysis was also independently formulated by the Frenchman
Adrien-Marie Legendre in
1805 and the American
Robert Adrain in
1808.
Problem statement
The objective consists of adjusting the parameters of a model function so as to best fit a data set. A simple data set consists of
m points (data pairs)
,
i=1,...,
m, where
is an
independent variable and
is a
dependent variable whose value is found by observation. The model function has the form
, where the
n adjustable parameters are held in the vector
. We wish to find those parameter values for which the model "best" fits the data. The least squares method defines "best" as when the sum,
S, of squared residuals
»